Automatically Generated Keywords: A Comparison to Author Generated Keywords in the Sciences

Author

  • C. D. Hurt
Abstract

This paper examines the differences between author generated keywords and automatically generated keywords in one area of scientific and technical literature. Using inverse frequency, keywords produced using both methods are examined using a maximum likelihood algorithm. By reducing the scope and size of the corpus of literature examined, this study more closely emulates the information gathering processes of scientists and technologists. Care was taken in developing the sample used, balancing statistical factors to allow interpretable outcomes and replication. The results of the study indicated there are no statistically significant differences between the two techniques.
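The abstract does not define "inverse frequency". One common reading is inverse document frequency, where a keyword assigned to fewer documents receives a higher weight. The sketch below illustrates that reading only; the function name and the toy keyword lists are hypothetical and are not taken from the paper, and the maximum likelihood comparison is not reproduced here.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Map each keyword to log(N / df), where N is the number of documents
    and df is the number of documents the keyword is assigned to."""
    n_docs = len(documents)
    doc_freq = Counter()
    for keywords in documents:
        # Count each keyword at most once per document.
        doc_freq.update(set(keywords))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Hypothetical keyword lists for four documents (illustrative data only).
author_keywords = [
    ["polymer", "catalysis"],
    ["polymer", "spectroscopy"],
    ["catalysis", "kinetics"],
    ["polymer", "kinetics"],
]

idf = inverse_document_frequency(author_keywords)
for term in sorted(idf, key=idf.get, reverse=True):
    print(f"{term}: {idf[term]:.3f}")
```

Under this weighting, a keyword used in only one document ("spectroscopy") scores higher than one used in most documents ("polymer"). The same function could be applied to an automatically generated keyword set so that the two sets of weights can be compared.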


Similar resources

Generating Natural Language Question-Answer Pairs from a Knowledge Graph Using a RNN Based Question Generation Model

In recent years, knowledge graphs such as Freebase that capture facts about entities and relationships between them have been used actively for answering factoid questions. In this paper, we explore the problem of automatically generating question answer pairs from a given knowledge graph. The generated question answer (QA) pairs can be used in several downstream applications. For example, they...


Exploring the Value of Folksonomies for Creating Semantic Metadata

Finding good keywords to describe resources is an on-going problem. Typically, we select such words manually from a thesaurus of terms, or they are created using automatic keyword extraction techniques. Folksonomies are an increasingly well-populated source of unstructured tags describing Web resources. This article explores the value of the folksonomy tags as a potential source of keyword meta...


A Study of the Degree of Language Consistency among Indexers, Authors, and Taggers in the ERIC and Mendeley Databases

Objective: The purpose of this study was to identify the language consistency between indexers, authors and taggers in the ERIC and Mendeley databases. Methodology: This survey was conducted using content analysis methods and techniques to evaluate the language consistency between indexers, authors and taggers in the ERIC and Mendeley databases and also to determine common keywords. The sample ...


Multimedia surrogates for video gisting: Toward combining spoken words and imagery

Good surrogates that allow people to quickly derive the gist of videos without taking the time to view the full video are crucial to video retrieval and browsing systems. Although there are many kinds of textual and visual surrogates used in video retrieval systems, there are few audio surrogates in practice. To evaluate the effectiveness of audio surrogates alone and in combination with one ki...


University of Chicago at the CLEF 2007 Cross Language Speech Retrieval Track

The University of Chicago participated in the CLEF 2007 CL-SR track, performing monolingual retrieval for both English and Czech and cross-language French-English retrieval. English experiments considered the impact of automatically generated keywords on retrieval. Czech experiments explored the effect of different stemming approaches on retrieval for this morphologically rich language. The bes...




Publication date: 2010